Global Convergence of Policy Gradient Primal–Dual Methods for Risk-Constrained LQRs
Authors
Abstract
While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest. Even though it has been an essential approach for reinforcement learning problems, there is little theoretical understanding of its performance. In this article, we focus on the risk-constrained linear quadratic regulator problem via the PO approach, which requires addressing a challenging nonconvex constrained problem. To solve it, we first build on our earlier result that the optimal policy has a time-invariant affine structure to show that the associated Lagrangian function is coercive, locally gradient dominated, and has a local Lipschitz continuous gradient, based on which we establish strong duality. Then, we design primal–dual methods with global convergence guarantees in both model-based and sample-based settings. Finally, we use samples of system trajectories in simulations to validate our methods.
Similar resources
Global Convergence of Policy Gradient Methods for Linearized Control Problems
Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model 2) they are an “end-to-end” approach, directly optimizing the performance metric of interest 3) they inherently allow for richly parameterized policies. A notable drawback is th...
Global Convergence Properties of Conjugate Gradient Methods for Optimization
This paper explores the convergence of nonlinear conjugate gradient methods without restarts, and with practical line searches. The analysis covers two classes of methods that are globally convergent on smooth, nonconvex functions. Some properties of the Fletcher-Reeves method play an important role in the first family, whereas the second family shares an important property with the Polak-Ribir...
Conditional Copula-GARCH Methods for Value at Risk of Portfolio: The Case of Tehran Stock Exchange Market
Value at risk is one of the most important risk measures used by firms. Accurate estimation of value at risk is a very important matter, and deviation from it can lead to bankruptcy or to a suboptimal allocation of a firm's resources. The main aim of this study is to examine the efficiency of the conditional copula-GARCH method in estimating the value at risk of a portfolio of two stocks, and the value at risk thus obtained, with traditional methods of estimating value at...
Global Convergence of Conjugate Gradient Methods without Line Search
Global convergence results are derived for well-known conjugate gradient methods in which the line search step is replaced by a step whose length is determined by a formula. The results include the following cases: 1. The Fletcher-Reeves method, the Hestenes-Stiefel method, and the Dai-Yuan method applied to a strongly convex LC objective function; 2. The Polak-Ribière method and the Conjugate ...
Gradient Convergence in Gradient Methods
For the classical gradient method $x_{t+1} = x_t - \gamma_t \nabla f(x_t)$ and several deterministic and stochastic variants, we discuss the issue of convergence of the gradient sequence $\nabla f(x_t)$ and the attendant issue of stationarity of limit points of $x_t$. We assume that $\nabla f$ is Lipschitz continuous, and that the stepsize $\gamma_t$ diminishes to 0 and satisfies standard stochastic approximation conditions. We show that eit...
Journal
Journal title: IEEE Transactions on Automatic Control
Year: 2023
ISSN: ['0018-9286', '1558-2523', '2334-3303']
DOI: https://doi.org/10.1109/tac.2023.3234176